Rank Discriminants for Predicting Phenotypes from RNA Expression

Authors

  • Bahman Afsari
  • Ulisses M. Braga-Neto
  • Donald Geman
Abstract

Statistical methods for analyzing large-scale biomolecular data are commonplace in computational biology. A notable example is phenotype prediction from gene expression data, for instance, detecting human cancers, differentiating subtypes and predicting clinical outcomes. Still, clinical applications remain scarce. One reason is that the complexity of the decision rules that emerge from standard statistical learning impedes biological understanding, in particular, any mechanistic interpretation. Here we explore decision rules for binary classification utilizing only the ordering of expression among several genes; the basic building blocks are then two-gene expression comparisons. The simplest example, just one comparison, is the TSP classifier, which has appeared in a variety of cancer-related discovery studies. Decision rules based on multiple comparisons can better accommodate class heterogeneity, and thereby increase accuracy, and might provide a link with biological mechanism. We consider a general framework (“rank-in-context”) for designing discriminant functions, including a data-driven selection of the number and identity of the genes in the support (“context”). We then specialize to two examples: voting among several pairs and comparing the median expression in two groups of genes. Comprehensive experiments assess accuracy relative to other, more complex, methods, and reinforce earlier observations that simple classifiers are competitive.
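The two specializations named in the abstract, voting among several two-gene comparisons and comparing the median expression of two gene groups, can be illustrated with a minimal Python sketch. It assumes an expression matrix `X` of shape (samples, genes) and binary labels `y`. The function names (`pair_scores`, `top_pairs`, `vote_predict`, `median_predict`), the scoring rule (difference in how often one gene is expressed below another in each class), and the greedy choice of disjoint pairs are illustrative assumptions. This is not the authors' selection procedure, and the data-driven choice of the number of genes described in the abstract is not reproduced.

```python
import numpy as np

def pair_scores(X, y):
    """Score every ordered gene pair (i, j): how much more often
    X[:, i] < X[:, j] holds in class 1 than in class 0 (assumed scoring rule)."""
    less = X[:, :, None] < X[:, None, :]      # shape (samples, genes, genes)
    p1 = less[y == 1].mean(axis=0)            # frequency of i < j in class 1
    p0 = less[y == 0].mean(axis=0)            # frequency of i < j in class 0
    return p1 - p0

def top_pairs(X, y, k=1):
    """Greedily pick the k highest-scoring pairs that share no gene."""
    scores = pair_scores(X, y)
    order = np.argsort(scores, axis=None)[::-1]   # flat indices, best first
    pairs, used = [], set()
    for flat in order:
        i, j = (int(v) for v in np.unravel_index(flat, scores.shape))
        if i == j or i in used or j in used:
            continue
        pairs.append((i, j))
        used.update((i, j))
        if len(pairs) == k:
            break
    return pairs

def vote_predict(X, pairs):
    """Each pair votes for class 1 when gene i is expressed below gene j;
    the majority wins (a tie goes to class 0, so an odd k avoids ties)."""
    votes = np.array([X[:, i] < X[:, j] for i, j in pairs])   # (k, samples)
    return (votes.mean(axis=0) > 0.5).astype(int)

def median_predict(X, group_a, group_b):
    """Predict class 1 when the median expression of group_a is below
    the median expression of group_b."""
    med_a = np.median(X[:, group_a], axis=1)
    med_b = np.median(X[:, group_b], axis=1)
    return (med_a < med_b).astype(int)

# Made-up toy data: 4 samples, 4 genes (values are illustrative only).
X_toy = np.array([[1, 5, 2, 6],
                  [2, 7, 1, 3],
                  [6, 1, 5, 2],
                  [8, 3, 9, 4]], dtype=float)
y_toy = np.array([1, 1, 0, 0])

single = top_pairs(X_toy, y_toy, k=1)    # one comparison, as in a TSP-style rule
several = top_pairs(X_toy, y_toy, k=2)   # voting among two disjoint pairs
print(vote_predict(X_toy, single), vote_predict(X_toy, several))
```

With `k=1` the rule reduces to a single comparison, in the spirit of the TSP classifier mentioned in the abstract. Because every rule here depends only on the ordering of expression values within a sample, its predictions are unchanged by any monotone per-sample transformation of the data.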

Similar resources

Splice Expression Variation Analysis (SEVA) for Inter-tumor Heterogeneity of Gene Isoform Usage in Cancer.

Motivation: Current bioinformatics methods to detect changes in gene isoform usage in distinct phenotypes compare the relative expected isoform usage in phenotypes. These statistics model differences in isoform usage in normal tissues, which have stable regulation of gene splicing. Pathological conditions, such as cancer, can have broken regulation of splicing that increases the heterogeneity of...

TRIzol-based RNA Extraction: A Reliable Method for Gene Expression Studies

RNA extraction is a prerequisite technique for gene expression studies, for analyzing disease etiology, progression, and treatment effects, and for designing diagnostic methods. Although many RNA extraction kits have been commercialized, these kits are expensive and are not accessible in some countries. Many studies have shown that TRIzol is an applicable material for the RNA extracti...

Phenol Based RNA Isolation is the Optimum Method for Study of Gene Expression in Human Urinary Sediment

Evaluation of gene expression in urinary sediment has been considered a promising non-invasive approach for biomarker identification in kidney diseases. Nonetheless, there are several challenges in extracting RNA from this valuable source of biomarkers, mostly because of factors that influence the quality of isolated RNA, such as low cellular content. Accordingly, we compared the q...

Invariants and polynomial identities for higher rank matrices

We exhibit explicit expressions, in terms of components, of discriminants, determinants, characteristic polynomials and polynomial identities for matrices of higher rank. We define permutation tensors and, in terms of them, construct discriminants and the determinant as the discriminant of order d, where d is the dimension of the matrix. The characteristic polynomials and the Cayley–Hamilton th...

Publication date: 2014